Spectral methods and computational trade-offs in high-dimensional statistical inference
Spectral methods have become increasingly popular in designing fast algorithms for modern high-dimensional datasets. This thesis looks at several problems in which spectral methods play a central role. In some cases, we also show that such procedures have essentially the best performance among all randomised polynomial-time algorithms by exhibiting statistical and computational trade-offs in those problems. In the first chapter, we prove a useful variant of the well-known Davis--Kahan theorem, a spectral perturbation result that allows us to bound the distance between population eigenspaces and their sample versions. We then propose a semi-definite programming algorithm for the sparse principal component analysis (PCA) problem, and analyse its theoretical performance using the perturbation bounds derived earlier. It turns out that the parameter regime in which our estimator is consistent is strictly smaller than the consistency regime of a minimax optimal (yet computationally intractable) estimator. We show, through a reduction from a well-known hard problem in computational complexity theory, that this difference in consistency regimes is unavoidable for any randomised polynomial-time estimator, revealing subtle statistical and computational trade-offs in this problem.

Such computational trade-offs also exist in the problem of restricted isometry certification. Certifiers for restricted isometry properties can be used to construct design matrices for sparse linear regression problems. As with sparse PCA, we show that there is an intrinsic gap between the class of matrices certifiable by unrestricted algorithms and by polynomial-time algorithms. Finally, we consider the problem of high-dimensional changepoint estimation, where we estimate the time of change in the mean of a high-dimensional time series with piecewise constant mean structure. Motivated by real-world applications, we assume that changes occur only in a sparse subset of all coordinates. We apply a variant of the semi-definite programming algorithm from sparse PCA to aggregate the signals across coordinates in a near-optimal way so as to estimate the changepoint location as accurately as possible. Our statistical procedure shows superior performance compared to existing methods on this problem.
St John's College and Cambridge Overseas Trust
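The changepoint step can be illustrated with a deliberately simplified sketch. Instead of the semi-definite programming aggregation from the thesis, the toy version below soft-thresholds coordinate-wise CUSUM statistics and takes the argmax of their aggregate; the signal strengths, dimensions, and the threshold value are illustrative assumptions, not the thesis's tuning.

```python
import math
import random

def cusum(series):
    """CUSUM transformation of one coordinate: for each candidate split
    after time t, the scaled difference of means before and after t."""
    n = len(series)
    total = sum(series)
    prefix, out = 0.0, []
    for t in range(1, n):
        prefix += series[t - 1]
        scale = math.sqrt(t * (n - t) / n)
        out.append(scale * (prefix / t - (total - prefix) / (n - t)))
    return out  # entry i corresponds to a split after time t = i + 1

def estimate_changepoint(panel, threshold):
    """Aggregate soft-thresholded squared CUSUM statistics across all
    coordinates and return the argmax as the changepoint estimate.
    This is a simplified stand-in for the SDP-based aggregation."""
    n = len(panel[0])
    agg = [0.0] * (n - 1)
    for series in panel:
        for i, c in enumerate(cusum(series)):
            agg[i] += max(abs(c) - threshold, 0.0) ** 2
    return 1 + max(range(n - 1), key=lambda i: agg[i])

# Illustrative data: change at time 60 in 3 of 20 coordinates.
random.seed(0)
n, p, z = 100, 20, 60
panel = [[random.gauss(2.0 if (t >= z and j < 3) else 0.0, 1.0)
          for t in range(n)] for j in range(p)]
print(estimate_changepoint(panel, threshold=1.0))
```

With a strong sparse mean shift, the aggregated statistic peaks at (or very near) the true change time; noise coordinates contribute almost nothing after thresholding.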
A useful variant of the Davis--Kahan theorem for statisticians
The Davis--Kahan theorem is used in the analysis of many statistical
procedures to bound the distance between subspaces spanned by population
eigenvectors and their sample versions. It relies on an eigenvalue separation
condition between certain relevant population and sample eigenvalues. We
present a variant of this result that depends only on a population eigenvalue
separation condition, making it more natural and convenient for direct
application in statistical contexts, and improving the bounds in some cases. We
also provide an extension to situations where the matrices under study may be
asymmetric or even non-square, and where interest is in the distance between
subspaces spanned by corresponding singular vectors.
Comment: 12 pages
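A small numerical illustration of the population-gap flavour of the bound, in the leading-eigenvector case where it reads sin θ ≤ 2‖E‖ / (λ₁ − λ₂). The matrices, the perturbation E standing in for sampling error, and the use of the Frobenius norm as an upper bound on the operator norm are all assumptions made for this sketch.

```python
import math

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def leading_eigvec(M, iters=500):
    """Power iteration for the leading eigenvector of a symmetric matrix."""
    v = [1.0] * len(M)
    for _ in range(iters):
        w = matvec(M, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Population covariance with eigenvalues 3, 1, 1 (gap lambda_1 - lambda_2 = 2)
Sigma = [[3.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# An arbitrary small symmetric perturbation standing in for sampling error
E = [[0.1, 0.2, -0.1], [0.2, -0.1, 0.05], [-0.1, 0.05, 0.1]]
Sigma_hat = [[Sigma[i][j] + E[i][j] for j in range(3)] for i in range(3)]

v = leading_eigvec(Sigma)          # population leading eigenvector
v_hat = leading_eigvec(Sigma_hat)  # its sample analogue
sin_theta = math.sqrt(max(0.0, 1.0 - sum(a * b for a, b in zip(v, v_hat)) ** 2))

# Frobenius norm of E upper-bounds its operator norm
E_frob = math.sqrt(sum(E[i][j] ** 2 for i in range(3) for j in range(3)))
bound = 2.0 * E_frob / (3.0 - 1.0)  # 2 ||E|| / (lambda_1 - lambda_2)
print(sin_theta, bound, sin_theta <= bound)
```

The point of the variant is visible here: the denominator involves only the population eigengap, so no separation condition on sample eigenvalues needs to be verified.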
Average-case Hardness of RIP Certification
The restricted isometry property (RIP) for design matrices gives guarantees
for optimal recovery in sparse linear models. It is of high interest in
compressed sensing and statistical learning. This property is particularly
important for computationally efficient recovery methods. As a consequence,
even though it is in general NP-hard to check that RIP holds, there have been
substantial efforts to find tractable proxies for it. These would allow the
construction of RIP matrices and the polynomial-time verification of RIP given
an arbitrary matrix. We consider the framework of average-case certifiers,
which never wrongly declare that a matrix is RIP, while often being correct
for random instances. While there are such functions which are tractable in a
suboptimal parameter regime, we show that this is a computationally hard task
in any better regime. Our results are based on a new, weaker assumption on the
problem of detecting dense subgraphs.
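For intuition about why certification is expensive, the sketch below brute-forces the order-2 restricted isometry constant of a small matrix by enumerating all 2-column supports; this exhaustive enumeration is exactly what blows up combinatorially at higher sparsity levels (the matrix dimensions and Gaussian design are illustrative choices, not from the paper).

```python
import itertools
import math
import random

def rip_constant_k2(A):
    """Brute-force the order-2 RIP constant: delta_2 is the largest spectral
    deviation from the identity of any 2x2 Gram matrix built from a pair of
    columns. For sparsity k this enumeration costs (p choose k) -- feasible
    only for tiny k, which is the computational obstacle in the abstract."""
    n, p = len(A), len(A[0])
    cols = [[A[r][c] for r in range(n)] for c in range(p)]

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    delta = 0.0
    for i, j in itertools.combinations(range(p), 2):
        a, b, c = dot(cols[i], cols[i]), dot(cols[i], cols[j]), dot(cols[j], cols[j])
        # Closed-form eigenvalues of the symmetric 2x2 Gram matrix [[a, b], [b, c]]
        mid, rad = (a + c) / 2.0, math.sqrt(((a - c) / 2.0) ** 2 + b * b)
        delta = max(delta, abs(mid + rad - 1.0), abs(mid - rad - 1.0))
    return delta

# Random Gaussian design: i.i.d. N(0, 1/n) entries give near-unit-norm columns.
random.seed(1)
n, p = 50, 10
A = [[random.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(p)] for _ in range(n)]
print(rip_constant_k2(A))
```

Random Gaussian designs satisfy RIP with high probability, but as the abstract notes, *certifying* this for a given matrix is NP-hard in general, and hard on average in any regime better than the known tractable one.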
SpreadDetect: Detection of spreading change in a network over time
Change-point analysis has been successfully applied to detect changes in
multivariate data streams over time. In many applications, when data are
observed over a graph/network, change does not occur simultaneously but instead
spreads from an initial source coordinate to the neighbouring coordinates over
time. We propose a new method, SpreadDetect, that estimates both the source
coordinate and the initial timepoint of change in such a setting. We prove that
under appropriate conditions, the SpreadDetect algorithm consistently estimates
both the source coordinate and the timepoint of change and that the minimal
signal size detectable by the algorithm is minimax optimal. The practical
utility of the algorithm is demonstrated through numerical experiments and a
real COVID-19 dataset.
Comment: 26 pages, 3 figures, 2 tables
Two-sample testing of high-dimensional linear regression coefficients via complementary sketching
We introduce a new method for two-sample testing of high-dimensional linear
regression coefficients without assuming that those coefficients are
individually estimable. The procedure works by first projecting the matrices of
covariates and response vectors along directions that are complementary in sign
in a subset of the coordinates, a process which we call 'complementary
sketching'. The resulting projected covariates and responses are aggregated to
form two test statistics, which are shown to have essentially optimal
asymptotic power under a Gaussian design when the difference between the two
regression coefficients is sparse and dense respectively. Simulations confirm
that our methods perform well in a broad class of settings.
Comment: 31 pages, 3 figures
Multiple Identifications in Multi-Armed Bandits
We study the problem of identifying the top arms in a multi-armed bandit
game. Our proposed solution relies on a new algorithm based on successive
rejects of the seemingly bad arms, and successive accepts of the good ones.
This algorithmic contribution allows us to tackle other multiple-identification
settings that were previously out of reach. In particular, we show that this
idea of successive accepts and rejects applies to the multi-bandit best arm
identification problem.
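A minimal sketch of the basic Successive Rejects idea that the abstract builds on: split the budget into phases, pull every surviving arm equally within a phase, and reject the empirically worst arm at each phase's end. Gaussian rewards, the specific means, and the budget are assumptions for illustration; this is the single-best-arm baseline, not the paper's extended accepts-and-rejects algorithm.

```python
import math
import random

def successive_rejects(means, budget, seed=0):
    """Successive Rejects for best-arm identification with a fixed budget.
    Phase lengths follow the standard schedule n_k proportional to
    (budget - K) / (log_bar(K) * (K + 1 - k)). Rewards are Gaussian with
    unit variance around the given means (an assumption of this sketch)."""
    rng = random.Random(seed)
    K = len(means)
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))
    active = list(range(K))
    pulls = [0] * K
    sums = [0.0] * K
    n_prev = 0
    for phase in range(1, K):  # K - 1 phases, each rejects one arm
        n_k = math.ceil((budget - K) / (log_bar * (K + 1 - phase)))
        for arm in active:
            for _ in range(n_k - n_prev):  # top up each survivor to n_k pulls
                sums[arm] += rng.gauss(means[arm], 1.0)
                pulls[arm] += 1
        n_prev = n_k
        worst = min(active, key=lambda a: sums[a] / pulls[a])
        active.remove(worst)
    return active[0]  # the last survivor is the guess for the best arm

means = [0.9, 0.5, 0.45, 0.4, 0.3]
print(successive_rejects(means, budget=2000))
```

Because later phases concentrate the remaining budget on fewer arms, the hardest comparisons receive the most samples; the accepts-and-rejects extension in the paper applies the same budget-splitting idea when several arms must be identified.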